Scanning system for inspecting anomalies on surfaces

ABSTRACT

An optical scanning system and method for detecting anomalies, including pattern defects and particulate contaminants, on both patterned and unpatterned surfaces, using a light beam, scanning at a grazing angle with respect to the surfaces, a plurality of detectors and an interchannel communication scheme to compare data from each detector, which facilitates characterizing anomalies. The light beam illuminates a spot on the surface which is scanned over a short scan-line. The surface is moved in a manner so that the spot is scanned over its entire area in a serpentine fashion along adjacent striped regions. The plurality of detectors include groups of collector channels disposed circumferentially around the surface, a bright field reflectivity/autoposition channel, an alignment/registration channel and an imaging channel. The collector channels in each group are symmetrically disposed, in the azimuth, on opposite sides of the center of the scan line. The position of the collector channels, as well as the polarization of the beam, facilitates distinguishing pattern defects from particulate contaminants. The bright field reflectivity/autoposition channel is positioned to receive specularly reflected light that carries information concerning local variation in reflectivity, which is used to classify detected anomalies, as well as determine variations in the height of the surface. The alignment/registration channel is positioned to detect a maximum of the light scattered from the pattern on the surface to ensure that the streets of die present on the surface are oriented so as not to be oblique with respect to the scan line. The imaging channel combines the advantages of a scanning system and an imaging system while improving signal/background ratio of the present system.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a continuation of U.S. patent application Ser. No. 09/954,287, filed Sep. 11, 2001, which is a continuation of U.S. patent application Ser. No. 09/760,558, filed Jan. 16, 2001, now U.S. Pat. No. 6,636,302, which is a continuation of U.S. patent application Ser. No. 09/213,022, filed Dec. 16, 1998, now U.S. Pat. No. 6,215,551, which is a continuation of U.S. patent application Ser. No. 08/499,995, filed Jul. 10, 1995, now U.S. Pat. No. 5,883,710, which is a continuation-in-part application of parent application entitled “Optical Scanning System for Surface Inspection,” by Mehrdad Nikoonahad, Keith B. Wells and Brian C. Leslie, Ser. No. 08/351,664, filed Dec. 8, 1994, now abandoned. This application is also related to the patent application entitled “Optical Wafer Positioning System,” by Mehrdad Nikoonahad, Philip R. Rigg, Keith B. Wells and David S. Calhoun, Ser. No. 08/361,131, filed Dec. 21, 1994 (“Related Application”), which has since issued as U.S. Pat. No. 5,530,550. Both prior applications are incorporated by reference herein in their entirety.

BACKGROUND OF THE INVENTION

This invention relates in general to surface inspection systems, and in particular, to a high speed scanner system for inspecting anomalies on surfaces such as semiconductor wafers, photomasks, reticles, ceramic tiles, and other surfaces.

The size of semiconductor devices fabricated on silicon wafers has been continually reduced. At the time this application is filed, for example, semiconductor devices can be fabricated at a resolution of a half micron or less and sixty-four (64) megabyte DRAMs are being fabricated with 0.35 micron design rule. The shrinking of semiconductor devices to smaller and smaller sizes has imposed a much more stringent requirement for sensitivity of wafer inspection instruments which are called upon to detect contaminant particles and pattern defects that are small compared to the size of the semiconductor devices. On the other hand, it is desirable for wafer inspection systems to provide an adequate throughput so that these systems can be used in production runs to detect defective wafers.

In U.S. Pat. No. 4,898,471 to Stonestrom et al., assigned to the assignee of the present application, the area illuminated on a wafer surface by a scanning beam is an ellipse which moves in the scan direction. In one example given by Stonestrom et al., the ellipse has a width of 20 microns and a length of 115 microns. Light scattered by anomalies or patterns in such illuminated area is detected by photodetectors placed at azimuthal angles in the range of 80 to 100°. The signals detected by the photodetectors are used to construct templates. When the elliptical spot is moved in the scan direction to a neighboring position, scattered light from structures within the spot is again detected and the photodetector signal is then compared to the template to ascertain the presence of contaminant particles or pattern defects as opposed to regular pattern. In Stonestrom et al., the scanning beam scans across the entire wafer to illuminate and inspect a long narrow strip of the wafer extending across the entire dimension of the wafer in the scanning direction. The wafer is then moved by a mechanical stage in a direction perpendicular to the scanning direction for scanning a neighboring elongated strip. This operation is then repeated until the entire wafer is covered.

While the system of Stonestrom et al. performs well for inspecting wafers having semiconductor devices that are fabricated with coarser resolution, with the continual shrinking of the size of the devices fabricated, it is now desirable to provide an improved inspection tool that can be used to detect very small size anomalies that may be difficult to detect using Stonestrom et al.'s system.

SUMMARY OF THE INVENTION

This invention is based on the recognition that very small anomalies can be detected by reducing the size of the area that is illuminated by the scanning light beam. Light scattered from structures in the spot will include background, such as light scattered by pattern on the surface, as well as light that is scattered by anomalies such as contaminant particles, pattern defects or imperfections of the surface. Such background can have a significant amplitude. For this reason, if the anomaly is of a size which is small compared to the size of the illuminated area, the scattered light from such anomaly may be overwhelmed by and become indistinguishable from the background. By reducing the size of the illuminated area or spot size, the ratio of the light intensity scattered by an anomaly to that of the background will be increased, thereby increasing detection sensitivity. However, with a smaller spot size, it will be more difficult to maintain the uniformity of the spot along a long straight scan line across the entire wafer. By breaking up the scan path into short segments, it is possible to employ a smaller spot size while at the same time maintaining uniformity of the spot along the path. From the system point of view, by reducing the length of the scan, the size of the collection optics for detecting forward scattered light becomes more manageable.
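By way of illustration only, the scaling argument above can be sketched with a toy model. The model assumes a fixed beam power spread uniformly over an elliptical spot, a pattern background proportional to the total power falling on the spot, and an anomaly much smaller than the spot whose scattered signal is proportional to the local intensity. These modelling choices, the function name, and the 10 by 30 micron spot are assumptions made for the example and are not taken from this application; only the 20 by 115 micron spot is the prior art example quoted above.

```python
import math

def signal_to_background(spot_width_um, spot_length_um,
                         anomaly_cross_section_um2=0.01,
                         background_fraction=1e-4, power=1.0):
    """Toy ratio of anomaly signal to pattern background (all values hypothetical)."""
    # Area of the elliptical illuminated spot, in square microns.
    area = math.pi * (spot_width_um / 2) * (spot_length_um / 2)
    intensity = power / area                        # uniform illumination assumed
    signal = intensity * anomaly_cross_section_um2  # small anomaly sees local intensity
    background = power * background_fraction        # background scales with total power
    return signal / background

large_spot = signal_to_background(20, 115)  # prior art example dimensions
small_spot = signal_to_background(10, 30)   # hypothetical smaller spot
improvement = small_spot / large_spot       # equals the ratio of the two spot areas
```

Under these assumptions the improvement depends only on the ratio of the illuminated areas, which is the qualitative point made in the text.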

Thus one aspect of the invention is directed towards a method for detecting anomalies on a surface, comprising the steps of directing a beam of light at a grazing angle towards the surface, causing relative motion between the beam and the surface so that the beam scans a scan path covering substantially the entire surface; and collecting light scattered along said path for detecting anomalies. The scan path includes a plurality of arrays of scan path segments, wherein each of at least some of such scan path segments has a span shorter than the dimensions of the surface.

As used in this application, “minimum width” of the illuminated area or spot on the surface to be inspected is defined as the minimum dimension of a boundary around the area or spot along any direction on the surface, where the boundary is defined as the locations on the surface where the illumination light intensity is a predetermined fraction or percentage of the maximum intensity of illumination in the area or spot. In the description of the preferred embodiment, for example, the boundary is where the light illumination intensity is 1/e² of the maximum intensity of illumination in the area or spot, e being the natural number. The minimum dimension is the minimum distance between two parallel lines that enclose between them the boundary of the area or spot. The term “minimum width” is explained in more detail below.

Another consideration of the invention is to provide an adequate throughput while data is collected at a moderate rate for defect detection so that the data collection and processing system employed need not be overly complex and expensive.

Thus another aspect of the invention is directed towards a method for detecting anomalies on the surface of a semiconductor wafer, comprising directing a beam of light towards a surface to illuminate an area of the surface defining a spot having a spot size whose minimum width is in the range of about 5 to 15 microns, causing relative motion between the beam and the wafer so that the beam scans a path covering the entire surface; and collecting light scattered along said path for detecting anomalies. The spot size and the directing and causing steps are such that the beam scanning substantially inspects the entire surface of the wafer at a throughput in excess of about 40 wafers per hour for 150 millimeter diameter wafers, at a throughput in excess of about 20 wafers per hour for 200 millimeter diameter wafers, and at a throughput in excess of about 10 wafers per hour for 300 millimeter diameter wafers.

Yet another aspect of the invention is directed towards a method for detecting anomalies on a surface, comprising the steps of directing a beam of light towards the surface to illuminate an area of the surface defining a spot having a spot size whose minimum width is in the range of about 5 to 15 microns, causing relative motion between the beam and the surface so that the beam scans a path covering substantially the entire surface; and collecting light scattered along said path for detecting anomalies. The spot size and said directing and causing steps are such that the surface is inspected at a speed not less than about 1.5 cm²/s.

Still another aspect of the invention is directed towards a method for detecting anomalies on a surface, comprising the steps of directing a beam of light towards said surface to illuminate an area of the surface defining a spot having a spot size whose minimum width is in the range of about 5 to 15 microns, causing relative motion between the beam and the surface so that the beam scans a path covering substantially the entire surface; and collecting light scattered along said path for detecting anomalies. The surface has dimensions of not less than 200 millimeters in any direction along the surface. The directing and causing steps are such that the beam scans substantially the entire surface in about 50 to 90 seconds.
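For orientation, the throughput figures recited above can be converted into average area inspection rates with simple arithmetic. The sketch below assumes a full circular wafer and assumes that all of the time per wafer is spent scanning, with no allowance for wafer handling; both are simplifications introduced for this example, and the function name is hypothetical.

```python
import math

def required_rate_cm2_per_s(wafer_diameter_mm, wafers_per_hour):
    """Average area rate needed if every second of the hour is spent scanning."""
    radius_cm = wafer_diameter_mm / 10 / 2
    area_cm2 = math.pi * radius_cm ** 2
    seconds_per_wafer = 3600 / wafers_per_hour
    return area_cm2 / seconds_per_wafer

# Wafer diameter (mm) paired with the recited throughput (wafers per hour).
rates = {d: required_rate_cm2_per_s(d, wph)
         for d, wph in [(150, 40), (200, 20), (300, 10)]}

# A circular surface 200 mm across scanned in 50 to 90 seconds.
area_200_cm2 = math.pi * 10 ** 2
rate_range = (area_200_cm2 / 90, area_200_cm2 / 50)

for diameter, rate in rates.items():
    print(f"{diameter} mm wafer: {rate:.2f} cm^2/s")
print(f"200 mm in 50-90 s: {rate_range[0]:.2f} to {rate_range[1]:.2f} cm^2/s")
```

Under these assumptions each recited throughput corresponds to an average rate above the 1.5 cm²/s figure.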

Another aspect of the invention is directed towards a system for detecting anomalies on a surface, comprising means for directing a beam of light at a grazing angle toward said surface; means for causing relative motion between the beam and the surface so that the beam scans a scan path covering substantially the entire surface; and means for collecting light scattered along said path for detecting anomalies. The scan path includes a plurality of arrays of scan path segments, wherein each of at least some of such scan path segments has a span shorter than the dimensions of the surface.

One more aspect of the invention is directed towards a system for detecting anomalies on a surface of a semiconductor wafer, comprising means for directing a beam of light towards said surface to illuminate an area of the surface defining a spot having a spot size whose minimum width is in the range of about 5 to 15 microns; means for causing relative motion between the beam and the wafer so that the beam scans a path covering substantially the entire surface; and means for collecting light scattered along said path for detecting anomalies. The spot size and said directing and causing means are such that the beam scanning substantially inspects the entire surface of the wafer at a throughput in excess of about 40 wafers per hour for 150 millimeter diameter wafers, at a throughput in excess of about 20 wafers per hour for 200 millimeter diameter wafers, and at a throughput in excess of about 10 wafers per hour for 300 millimeter diameter wafers.

Yet another aspect of the invention is directed towards a system for detecting anomalies on a surface, comprising means for directing a beam of light toward said surface to illuminate an area of the surface defining a spot having a spot size whose minimum width is in the range of about 5 to 15 microns; means for causing relative motion between the beam and the surface so that the beam scans a path covering substantially the entire surface; and means for collecting light scattered along said path for detecting anomalies. The spot size and said directing and causing means are such that the surface is inspected at a speed not less than about 1.5 cm²/s.

Still one more aspect of the invention is directed towards a system for detecting anomalies on a surface, comprising means for directing a beam of light toward said surface to illuminate an area of the surface defining a spot having a spot size whose minimum width is in the range of about 5 to 15 microns; means for causing relative motion between the beam and the surface so that the beam scans a path covering substantially the entire surface; and means for collecting light scattered along said path for detecting anomalies. The surface has dimensions of not less than 200 millimeters in any direction along the surface. The directing and causing means are such that the beam scans substantially the entire surface in about 50 to 90 seconds.

It is a further object of the present invention to classify detected anomalies and determine their size while increasing the confidence and accuracy of the detection system by reducing false counts.

These objects have been achieved with an apparatus and method for detecting anomalies of sub-micron size, including pattern defects and particulate contaminants, on both patterned and unpatterned wafer surfaces. For the purposes of this application, a particulate contaminant is defined as foreign material resting on a surface, generally protruding out of the plane of the surface. A pattern defect is in the plane of the surface and is usually induced by contaminants during a photolithographic processing step. The device employs a plurality of collector channels symmetrically disposed, in the azimuth, on opposite sides of the center of a scan line. In addition to the collector channels, other detector channels are employed to enhance the detection of anomalies. The collector and detector channels are collectively referred to as inspection channels. Also, an interchannel communication apparatus is employed to compare and adjust data received from each of the inspection channels which facilitate detecting and characterizing anomalies. A laser beam illuminates a localized spot on a wafer surface with the beam having a grazing angle of incidence, and the spot is scanned over a short scan line. The wafer is orientated so that the streets of the patterns on the die are not oblique with respect to the scan line, i.e., the streets are either perpendicular or parallel to the scan line. The surface is moved in a serpentine fashion, along adjacent striped regions, as the spot is scanned over its entire area. The position of the inspection channels, as well as the polarization of the beam, allows distinguishing, inter alia, pattern defects from particulate contaminants. The detector channels include an imaging channel which combines the advantages of a scanning system and an imaging system while improving signal/background ratio of the present system. 
The inspection channels collect light and feed it to a light detector for producing an electrical signal corresponding to the collected light intensity. The interchannel communication apparatus is a processor which stores, in memory, the information carried by the signals from the inspection channels, with the memory addresses corresponding to spatial positions on the surface. The processor constructs maps from the stored information, representing the anomalies detected on the surface. The maps from the inspection channels are compared by performing various algorithms and logical operations, e.g., OR, AND and XOR, to characterize the detected anomalies.

In operation, each wafer is scanned with a beam incident thereon at a grazing angle and the light scattered and specularly reflected from the wafer's surface are simultaneously collected with the above mentioned inspection channels. Previously, the wafer has been aligned so that the streets on the die are not oblique with respect to the scan line. Light collected is converted into electrical signals which are further processed by dedicated electronics. A processor analyzes the information carried by the signals and produces various maps representing the light intensity detected at various beam positions. The maps are compared either in the analog domain or digitally to identify and characterize anomalies. If compared digitally, the maps are binarized which allows performing various algorithmic and logical, e.g. OR, AND and XOR, operations on the data they represent, thereby allowing a user to choose a desired level of confidence in the detected anomalies. The binarization can take place against either a constant or a variable threshold, further reducing the occurrence of false counts. The variable threshold is dependent upon the local reflectivity and can be derived from a reflectivity channel which determines local reflectivity of the surface based upon detecting specularly reflected light.
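The digital comparison described above can be illustrated with a minimal sketch. It binarizes two small intensity maps against either a constant threshold or a per-position threshold, then combines them with AND, OR and XOR. The map values, the function names, and in particular the linear dependence of the variable threshold on local reflectivity are assumptions made for this example; the text above does not specify how the variable threshold is derived beyond its dependence on local reflectivity.

```python
def binarize(intensity_map, threshold):
    """Return a 0/1 map; threshold is a constant or a map of the same shape."""
    result = []
    for r, row in enumerate(intensity_map):
        out_row = []
        for c, value in enumerate(row):
            t = threshold[r][c] if isinstance(threshold, list) else threshold
            out_row.append(1 if value > t else 0)
        result.append(out_row)
    return result

def combine(map_a, map_b, op):
    """Combine two binarized maps position by position with AND, OR or XOR."""
    ops = {"AND": lambda a, b: a & b,
           "OR": lambda a, b: a | b,
           "XOR": lambda a, b: a ^ b}
    f = ops[op]
    return [[f(a, b) for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(map_a, map_b)]

def variable_threshold(reflectivity_map, base=0.2, gain=0.5):
    """Hypothetical rule: threshold rises linearly with local reflectivity."""
    return [[base + gain * r for r in row] for row in reflectivity_map]

# Hypothetical intensity maps from two inspection channels (rows = scan lines).
channel_a = [[0.1, 0.9, 0.1], [0.1, 0.6, 0.1]]
channel_b = [[0.1, 0.8, 0.7], [0.1, 0.1, 0.1]]
reflectivity = [[0.2, 0.2, 0.2], [0.2, 1.0, 0.2]]

bin_a = binarize(channel_a, 0.5)
bin_b = binarize(channel_b, 0.5)
both = combine(bin_a, bin_b, "AND")    # seen by both channels: higher confidence
either = combine(bin_a, bin_b, "OR")   # seen by at least one channel
differ = combine(bin_a, bin_b, "XOR")  # seen by exactly one channel
bin_a_var = binarize(channel_a, variable_threshold(reflectivity))
```

In this example the AND map keeps only the position flagged by both channels, and the variable threshold suppresses the count at the highly reflective position in the second row.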

The invention has advantages over the previous scanning techniques in that it provides a small spot that scans at speeds far in excess of those of the prior art, while providing the added feature of classifying anomalies. Further, controlling the polarization of the incident beam and the light detected results in an excellent ratio of particle to pattern signal.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1A is a schematic view of an elliptical-shaped illuminated area or spot on a surface to be inspected to illustrate the invention.

FIG. 1B is a graphical illustration of the illumination intensity across the width or short axis of the elliptical spot of FIG. 1A for defining a boundary of the spot and to illustrate the invention.

FIG. 2 shows partially in perspective and partially in block diagram form a system for inspecting anomalies of a semiconductor wafer surface to illustrate the preferred embodiment of the invention.

FIG. 3 is a perspective view showing in more detail the illumination and collection features of the system of FIG. 2.

FIG. 4 is a schematic view of a small portion of a semiconductor wafer surface illustrating the scan path of an illumination spot on the surface to illustrate the preferred embodiment.

FIG. 5 is a schematic view illustrating the illumination and collection angles of the system of FIG. 3.

FIG. 6 is a schematic view of three elliptical illuminated areas or spots to illustrate the scanning and data acquisition processes of this invention.

FIGS. 7A, 7B are side views illustrating two different polarization schemes employed by the present invention for illuminating a surface to be inspected.

FIG. 8 is a simplified perspective plan view of the illumination and collection optics of the present invention.

FIG. 9 is a top view of the illumination and collection optics shown in FIG. 8.

FIG. 10 is a detailed view showing the scan path of a spot on a wafer surface.

FIG. 11 is a detailed view of a collection channel shown in FIG. 8.

FIGS. 12A, 12B are plan views showing a polarization scheme employed by the present invention.

FIG. 13 is a graph of an electrical signal amplitude (I) versus beam scan position (X) on a wafer produced by the method of the present invention using the apparatus shown in FIG. 8.

FIGS. 14A-14E are top views of a display derived from a scan of the wafer, as shown in FIG. 10.

FIG. 15 is a plan view of an imaging channel shown in FIG. 8.

For simplicity, identical components in the different figures of this invention are labeled by the same numerals.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT

FIG. 1A is a schematic view of an elliptical-shaped illuminated area (or spot) of a surface inspected by the system of this invention to illustrate the invention. As explained below, the laser beam illuminating the surface inspected approaches the surface at a grazing angle, so that even though the illumination beam has a generally circular cross-section, the area illuminated is elliptical in shape such as area 10 in FIG. 1A. As known to those skilled in the art, in light beams such as laser beams, the intensity of the light typically does not have a flat distribution and does not fall off abruptly to zero across the boundary of the spot illuminated, such as at boundary 10 a of spot 10 of FIG. 1A. Instead, the intensity falls off at the outer edge of the illuminated spot at a certain inclined slope, so that instead of sharp boundaries such as boundary 10 a illustrated in FIG. 1A, the boundary is typically blurred and forms a band of decreasing intensity at increasing distance away from the center of the illuminated area.

In many lasers, the laser beam produced has a Gaussian intensity distribution, such as that shown in FIG. 1B. FIG. 1B is a graphical illustration of the spatial distribution of the illumination intensity in the Y direction of a laser beam that is used in the preferred embodiment to illuminate spot 10 of a surface to be inspected as shown in FIG. 1A, and thus is also the illumination intensity distribution across spot 10 in the Y direction. As shown in FIG. 1B, the illumination intensity has been normalized so that the peak intensity is 1, and the illumination intensity has a Gaussian distribution in the Y direction as well as in the X direction. Points 12 and 14 are at spatial locations y1 and y5, at which the illumination intensity drops to 1/e² of the peak intensity, where e is the natural number. As used herein to describe the preferred embodiment, the minimum width of spot 10 is the distance between these two points (the distance between y1 and y5); it is the length of the short axis of elliptical illuminated area 10 and is denoted as the width w in FIG. 1A. The spot 10 is defined by the area within a boundary 10 a where the illumination is 1/e² of the maximum intensity of illumination at the center of the spot.

As a broader definition, “minimum width” of the elliptical spot 10 a may be defined as the minimum distance between two parallel lines that enclose between them the boundary of the area or spot. In reference to spot 10 in FIG. 1A, for example, if one were to draw two parallel lines enclosing the boundary 10 a, such as lines q1, q2, the distance between lines q1, q2 is d, which is minimized when both q1, q2 touch the boundary 10 a. The distance d is minimum for all directions of q1, q2 when the lines q1, q2 coincide with grid lines y1, y5, so that the minimum width of the spot 10 is w. Even where 10 a is not an ellipse, but is of another shape such as rectangular, square, or irregular in shape, the same broader definition is applicable.
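The broader definition above amounts to finding the smallest separation between two parallel lines that enclose the boundary, taken over all line orientations. A minimal numerical sketch follows. It samples the boundary as a list of points and sweeps the line orientation in small steps; the function names, the sampling densities, and the 30 by 10 micron ellipse and 4 by 2 rectangle are assumptions chosen for the example.

```python
import math

def ellipse_boundary(semi_long, semi_short, n=720):
    """Sample n points on an axis-aligned ellipse (long axis along X)."""
    return [(semi_long * math.cos(2 * math.pi * i / n),
             semi_short * math.sin(2 * math.pi * i / n)) for i in range(n)]

def minimum_width(points, n_directions=1800):
    """Smallest distance between two parallel lines enclosing all points."""
    best = float("inf")
    for j in range(n_directions):
        phi = math.pi * j / n_directions       # orientation of the line normal
        nx, ny = math.cos(phi), math.sin(phi)
        # Project every boundary point onto the normal; the two enclosing
        # lines touch the boundary at the extreme projections.
        projections = [x * nx + y * ny for x, y in points]
        best = min(best, max(projections) - min(projections))
    return best

# Hypothetical elliptical spot, 30 microns long and 10 microns wide.
w_ellipse = minimum_width(ellipse_boundary(15.0, 5.0))

# The same definition applied to a non-elliptical boundary (4 by 2 rectangle).
w_rectangle = minimum_width([(0, 0), (4, 0), (4, 2), (0, 2)])
```

For the ellipse the minimum is reached when the enclosing lines are parallel to the long axis, so the result is the length of the short axis, consistent with the width w described for spot 10.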

FIG. 1B shows only the main lobe of the laser or light beam. It is known that the main lobe is also accompanied by sidelobes, so that areas of the surface outside of area or spot 10 would also be illuminated; light in the sidelobes that is scattered by structures of the surface and collected by the detectors causes noise.

In the description above, it is indicated that for a spot which is relatively small compared to the size of the surface to be inspected, it will be difficult to maintain uniformity of the spot across a scan line which spans the entire length or width of the wafer. In reference to FIGS. 1A, 1B, variation in the minimum width (as defined above) w of the main lobe of the focal plane intensity distribution, and the level of the sidelobes is a measure of the uniformity of the spot 10 as the beam scans across the surface. Where the minimum width and the sidelobes level vary little over the scan line, the spot is said to be uniform. In other words, when the spot size is relatively small compared to the size of the surface to be inspected, it will be difficult to maintain uniform width of the main lobe and uniform level of the sidelobes of the focal plane intensity distribution throughout the extent of a long scan line across the entire width of the wafer. A variation in either one of these two parameters (width of main lobe and sidelobe level) leads to a variation in detection sensitivity along the scan direction which is undesirable.

In view of the above problems, Applicants have invented a surface inspection system where the size of the area illuminated by the scanning light beam can be reduced while maintaining uniform detection sensitivity by causing the scanning light beam to scan short scan path segments having a spatial span less than the dimension of the surface it is scanning, as illustrated in the preferred embodiment in FIGS. 2 and 4, where these short scan path segments are not connected together but are located so that they form arrays of scan path segments as illustrated in more detail below.

The surface inspection system of this invention will now be described in reference to FIGS. 2 and 3. As shown in FIG. 2, system 20 includes a laser 22 providing a laser beam 24. Beam 24 is expanded by beam expander 26 and the expanded beam 28 is deflected by acousto-optic deflector (AOD) 30 into a deflected beam 32. The deflected beam 32 is passed through post-AOD and polarization selection optics 34 and the resulting beam is focused by telecentric scan lens 36 onto a spot 10 on surface 40 to be inspected, such as that of a semiconductor wafer, photomask or ceramic tile, patterned or unpatterned.

In order to move the illuminated area that is focused onto surface 40 for scanning the entire surface, the AOD 30 causes the deflected beam 32 to change in direction, thereby causing the illuminated spot 10 on surface 40 to be scanned along a scan line 50. As shown in FIG. 2, scan line 50 is preferably a straight line having a length which is smaller than the dimension of surface 40 along the same direction as the scan line. Even where line 50 is curved, its span is less than the dimension of surface 40 along the same general direction. After the illuminated spot has completed scanning surface 40 along scan line 50, surface 40 of the wafer is moved along the X axis so that the illuminated area of the surface moves along arrow 52 and AOD 30 causes the illuminated spot to scan along a scan line parallel to scan line 50 and in an adjacent position spaced apart from scan line 50 along the negative X axis. After the illuminated spot has covered such scan line, surface 40 is moved by a small distance so that the area of the surface to be illuminated is moved along direction 52 in order to scan an adjacent scan line at a different X position. As described below, this small distance preferably is equal to about one quarter of the height of spot 10. This process is repeated until the illuminated spot has covered strip 54; at this point in time the illuminated area is at or close to the edge 54 a. At such point, the surface 40 is moved along the Y direction by about the length of scan line 50 in order to scan and cover an adjacent strip 56, beginning at a position at or close to edge 56 a. The surface in strip 56 is then covered by short scan lines such as 50 in a similar manner until the other end or edge 56 b of strip 56 is reached, at which point surface 40 is again moved along the Y direction for scanning strip 58. This process is repeated prior to the scanning of strips 54, 56, 58 and continues after the scanning of such strips until the entire surface 40 is scanned.
Surface 40 is therefore scanned by scanning a plurality of arrays of short path segments the totality of which would cover substantially the entire surface 40.
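The scanning sequence described above can be sketched as a generator of short scan path segments. The sketch treats the surface as a rectangle, scans every segment in the same Y direction, steps in X between segments, and reverses the X travel on alternate strips so that each strip begins near the edge where the previous one ended. The rectangular surface, the function name and the numerical values are assumptions for the example.

```python
import math

def scan_segments(surface_x, surface_y, scan_length, x_step):
    """Yield (strip_index, x, y_start, y_end) for each short scan line."""
    n_strips = math.ceil(surface_y / scan_length)  # strips side by side along Y
    n_lines = math.ceil(surface_x / x_step)        # scan lines per strip along X
    for strip in range(n_strips):
        y_start = strip * scan_length
        y_end = min(y_start + scan_length, surface_y)
        # Serpentine motion: X travel reverses on every other strip.
        if strip % 2 == 0:
            line_indices = range(n_lines)
        else:
            line_indices = range(n_lines - 1, -1, -1)
        for i in line_indices:
            yield (strip, i * x_step, y_start, y_end)

# Hypothetical surface of 20 by 10 units, scan line length 5, X step 4.
segments = list(scan_segments(20, 10, 5, 4))
```

Each tuple describes one short segment whose span (here 5 units) is less than the dimension of the surface along the same direction (here 10 units), and the segments of one strip form one array of scan path segments.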

FIG. 4 is an exploded view of a portion of the two strips 54, 56 and smaller portions of two other neighboring strips to illustrate in more detail the above-described scanning process. In the preferred embodiment as shown in FIG. 4, the optical beam 38 scans in only one direction as illustrated by the arrows of scan path segments 50, 50′, 50″, 50′″. Scan path 50 has an effective start location at 72 and spot 10 moves to the right therefrom until it reaches the border 55 between strips 54 and 56. Upon reaching border 55, a stage (see FIG. 3) moves the surface 40 in the X direction perpendicular to the scanning direction Y and the spot assumes the new start position 74 and moves along a scan line 50′ parallel to scan line 50. The movement of the spot 10 along scan lines 50, 50′, 50″, 50′″ and so on is achieved by means of AOD 30 as explained below.

The deflection of beam 32 by AOD 30 is controlled by chirp generator 80 which generates a chirp signal. The chirp signal is amplified by amplifier 82 and applied to the transducer portion of AOD 30 for generating sound waves to cause deflection of beam 32 in a manner known to those skilled in the art. For a detailed description of the operation of the AOD, see “Acoustooptic Scanners and Modulators,” by Milton Gottlieb in Optical Scanning, ed. by Gerald F. Marshall, Dekker 1991, pp. 615-685. Briefly, the sound waves generated by the transducer portion of AOD 30 modulate the optical refractive index of an acoustooptic crystal in a periodic fashion thereby leading to deflection of beam 32. Chirp generator 80 generates appropriate signals so that after being focused by lens 36, the deflection of beam 32 causes the focused beam to scan along a scan line such as line 50 in the manner described.

Chirp generator 80 is controlled by timing electronic circuit 84 which in the preferred embodiment includes a microprocessor. The microprocessor supplies the beginning and end frequencies f1, f2 to the chirp generator 80 for generating appropriate chirp signals to cause the deflection of beam 32 within a predetermined range of deflection angles determined by the frequencies f1, f2. The auto-position sensor (APS) optics 90 and APS electronics 92 are used to detect the level or height of surface 40 and form a part of the Related Application. Detectors such as detector 111 b collect light scattered by anomalies as well as the surface and other structures thereon along scan line 50 and provide output signals to a processor in order to detect and analyze the characteristics of the anomalies.
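The dependence of the deflection range on the frequencies f1, f2 can be illustrated with the usual small-angle approximation for acousto-optic deflection, in which the deflection angle is roughly the optical wavelength times the acoustic frequency divided by the acoustic velocity, and the scan length after the lens is roughly the focal length times the scan angle. This is a textbook approximation, not a formula stated in the passage above, and the wavelength, acoustic velocity, focal length and frequencies below are hypothetical values chosen for the example.

```python
def deflection_angle_rad(wavelength_m, acoustic_freq_hz, acoustic_velocity_m_s):
    """Small-angle deflection of the diffracted beam relative to the undiffracted beam."""
    return wavelength_m * acoustic_freq_hz / acoustic_velocity_m_s

def scan_range(f1_hz, f2_hz, wavelength_m, acoustic_velocity_m_s, focal_length_m):
    """Return the two limiting deflection angles and the resulting scan line length."""
    theta1 = deflection_angle_rad(wavelength_m, f1_hz, acoustic_velocity_m_s)
    theta2 = deflection_angle_rad(wavelength_m, f2_hz, acoustic_velocity_m_s)
    scan_angle = theta2 - theta1
    # Paraxial assumption: scan length is focal length times scan angle.
    return theta1, theta2, focal_length_m * scan_angle

# Hypothetical values: 488 nm light, 650 m/s acoustic velocity, 100 mm focal
# length, chirp from 50 MHz to 100 MHz.
theta1, theta2, scan_length_m = scan_range(50e6, 100e6, 488e-9, 650.0, 0.1)
print(f"scan line length: {scan_length_m * 1e3:.2f} mm")
```

Under this approximation the scan line length scales with the chirp bandwidth f2 minus f1, which is why the choice of f1 and f2 fixes the span of a scan line such as line 50.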

FIG. 3 is a perspective view of system 20 of FIG. 2 showing in more detail the arrangement of the collection or detection channels to illustrate the preferred embodiment. As shown in FIG. 3, four collection channels are used, two channels 110 a, 110 b for collecting scattered light that is within the respective ranges of azimuthal angles of −(75-105)° and (75-105)°. Two additional collection channels 111 a, 111 b are also employed for detecting forward scattered light that is within the respective ranges of azimuthal angles of −(30-60)° and (30-60)°. If desired, it is of course possible to employ four independent collection channels with other different solid angles of collection, two of said collection channels located in the forward direction to collect light in the forward direction centered substantially at ±45° azimuthally and two of the channels located to collect light centered substantially at ±90° azimuthally.

FIG. 5 is a top view of the angles of collection of the four detectors. As shown in FIG. 5, the solid angles of collection of channels 110 a, 110 b are labeled Φ₁ and those for channels 111 a, 111 b are labeled Φ₂. To simplify the drawing, the components between laser 22 and focus beam 38 are not shown in FIG. 5. In reference to FIG. 3, system 20 also includes an imaging channel 121 and an alignment/registration channel 122.

Surface 40 may be smooth (118) or patterned (119). The incident focus beam 38 is preferably in the range of about 10-85° to the normal direction 150 to the surface 40 and more preferably within the range of 50-80° from the normal; in FIG. 3, this angle is labelled θ. The four channels of collection are preferably at elevation angles α that will collect scattered light from 3-30° from the plane of surface 40.

Improved Sensitivity of Detection

From the point of view of sensitivity of detection, it is desirable to design the illumination optics portion of system 20 so that the minimum width w of the illuminated spot 10 is minimized. The minimum width w is proportional to the focal length of lens 36 and inversely proportional to the beam diameter of beams 28 and 32. Therefore, the minimum width w can be reduced by reducing the focal length of lens 36 or increasing the diameter of beam 28, or both. If the focal length of lens 36 is increased, however, this will increase the length of scan line 50, which may be undesirable. If the diameter of beam 28 becomes comparable to the clear aperture of the crystal in AOD 30, this will produce higher level sidelobes, which is undesirable. As noted above, an increased level of sidelobes will increase the background signal level. Applicants discovered that it is preferable for the ratio k of the clear aperture of the crystal in the AOD 30 to the diameter of beams 28 and 32 to exceed 1.2.

It is possible to increase the beam diameter of beams 28 and 32 by employing a long AOD crystal, while maintaining k to be above 1.2. However, in addition to cost considerations, a larger AOD crystal will cause larger losses, thereby degrading the diffraction efficiency of the AOD device. For this reason, it is desirable to employ AOD crystals that are as small as possible, while at the same time meeting the sensitivity and throughput requirements. Assuming that the beam 28 that is entering the AOD 30 has a Gaussian intensity profile, the clear aperture of the AOD, D, satisfies:

$$D = \frac{4klv}{\pi w \Delta f} \qquad (1)$$

where π is the ratio of the circumference of a circle to its diameter.

In equation 1, l is the length of scan path segment 50, v is the acoustic velocity in the crystal of AOD 30, w is the length of the short axis of the elliptical spot (or the minimum width of the spot if not elliptical) on surface 40, and Δf or (f2-f1) is the bandwidth of the AOD 30. The constant k is preferably in the range 1.2-5. In one embodiment, k is 1.7 and l is in the range of about 2-10 millimeters.
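As a rough numerical illustration of equation 1, taken in the form D = 4klv/(πwΔf) that agrees with the duty factor expression of equation 2, the sketch below evaluates the clear aperture in SI units. The function names are hypothetical. The acoustic velocity (650 m/s), spot width (10 microns) and bandwidth (114 MHz) are assumed example values, not figures stated for the preferred embodiment; k = 1.7 and l = 6.2 mm are the values quoted in the text.

```python
import math


def clear_aperture(k, scan_length, acoustic_velocity, spot_width, bandwidth):
    """Clear aperture D (m) of the AOD from equation 1.

    k                 ratio of clear aperture to beam diameter
    scan_length       l, length of the scan path segment (m)
    acoustic_velocity v, sound velocity in the AOD crystal (m/s)
    spot_width        w, minimum width of the illuminated spot (m)
    bandwidth         delta f = f2 - f1 (Hz)
    """
    return (4.0 * k * scan_length * acoustic_velocity
            / (math.pi * spot_width * bandwidth))


def beam_diameter(aperture, k):
    """Beam diameter implied by the ratio k = aperture / beam diameter."""
    return aperture / k


# Assumed example values (v, w and bandwidth are illustrative only).
D = clear_aperture(k=1.7, scan_length=6.2e-3, acoustic_velocity=650.0,
                   spot_width=10e-6, bandwidth=114e6)
d_beam = beam_diameter(D, 1.7)
```

With these assumed inputs D is roughly 7.7 mm and the implied beam diameter is about 4.5 mm. Doubling the bandwidth halves the required aperture, which is the trade-off the text describes.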

Throughput Considerations

For a semiconductor wafer inspection instrument to be used for wafer inspection in actual production for inspecting the entire surface of the wafer, throughput considerations are paramount. Therefore, in addition to sensitivity capability described above, it is also desirable for the wafer inspection system of this invention to have a high throughput. The time required for inspecting semiconductor wafers first includes the time required for the illuminating light beam to scan the entire surface of the wafer. To perform the above-described short scan path segment scans, the time required to scan the entire surface depends on a number of factors. One factor obviously is the angle of illumination of the illuminating beam, or the value of θ, that is the angle between the illuminating beam and normal 150 to surface 40 to be inspected shown in FIG. 3. The larger the value of θ (that is, the smaller the grazing angle of incidence), the more elongated would be the shape of the spot 10 in FIG. 1A, and the larger is the area being inspected. Another factor affecting throughput is the fact that the intensity distribution of the illuminating beam is typically not flat but varies, such as in the form of a Gaussian distribution. Therefore, the intensity of scattering from a location on a surface would depend on the intensity of the illuminating light at that location. In order to compensate for such variation of intensity, a number of data points are obtained from the scattering from the particular location of the surface as the spot is moved across the location in a manner illustrated in FIG. 6 described below.

FIG. 6 is a schematic view of three positions of the illuminated area on a surface to be inspected to illustrate the scanning and data gathering process of system 20. As shown in FIG. 6, at one instant in time, beam 38 illuminates an area 10 on surface 40. Area or spot 10 is divided into sixteen pixels by grid lines x1-x5, y1-y5. In this context, the term “pixel” is meaningful only in reference to the taking of data samples across the intensity distribution such as that in FIG. 1B and subsequent data processing and is borrowed from data sampling and processing in other technologies such as video technology. The pixel that is bounded by grid lines x2, x3 and y2, y3 is pixel P shown as a shaded area in FIG. 6. If there is an anomaly in this pixel P, and if the light illuminating pixel P has the intensity distribution as shown in FIG. 1B with a high intensity level between grid lines y2 and y3, light scattered by the anomaly would also have a high intensity. However, as the beam moves along the Y axis so that the area 10′ is illuminated instead, pixel P is still illuminated but at the lower intensity level found between grid lines y1 and y2 in FIG. 1B. Therefore, if the sampling rate employed by the processor 200 in FIG. 3 for processing light detected by the collection or collector channels 110 a, 110 b, 111 a, 111 b is such that a sample is taken when the illuminating beam is in position 10 and when the illuminating beam is in position 10′, then two data points will be recorded. Thus if pixel P contains an anomaly, then two data points will be taken, one when the illumination is at a higher level as illustrated by data point D2 in FIG. 1B and another one when the illumination is at a lower level, illustrated at data point D1 in FIG. 1B. If position 10 is not the starting position in the short scan path segment 50 illustrated in FIGS. 3 and 4, then two prior samples would have been taken prior to the time when the illuminating beam illuminates the surface 40 in position 10, so that the processor would have obtained two more data points D3, D4 corresponding to the prior positions of the illuminating beam when light of intensity values between grid lines y3, y4 and between y4, y5 respectively illuminates such pixel P. In other words, four separate data points D1-D4 would have been taken of the light scattered by the anomaly in pixel P as the illumination beam illuminates pixel P when scanning along the Y direction.
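The four illumination levels behind data points D1-D4 can be illustrated with a short sketch. It assumes a Gaussian profile and, purely for illustration, a spot whose sampled extent spans two standard deviations on either side of the peak, divided into equal pixels. The function name and the choice of two standard deviations are hypothetical and do not come from the specification.

```python
import math


def pixel_illumination(num_pixels=4, half_width_in_sigma=2.0):
    """Relative illumination at the center of each pixel across the spot.

    The spot is modeled as a Gaussian sampled over
    [-half_width_in_sigma, +half_width_in_sigma] (in units of sigma) and
    split into num_pixels equal pixels. Both defaults are assumptions.
    """
    step = 2.0 * half_width_in_sigma / num_pixels
    centers = [-half_width_in_sigma + (i + 0.5) * step
               for i in range(num_pixels)]
    # Peak-normalized Gaussian evaluated at each pixel center.
    return [math.exp(-0.5 * c * c) for c in centers]


# Levels seen by one surface location as the spot steps across it:
# index 0 ~ D1 (y1-y2), 1 ~ D2 (y2-y3), 2 ~ D3 (y3-y4), 3 ~ D4 (y4-y5).
levels = pixel_illumination()
```

Under these assumptions the two inner samples see a markedly higher illumination than the two outer ones, which is why a single sample per location would make the recorded scattering depend on where the anomaly happened to sit within the spot.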

In most laser beams, the beam intensity has a Gaussian distribution not only in the Y direction but also in the X direction. For this reason, after the illuminating beam completes the scanning operation for scanning a short scan path segment such as segment 50 as shown in FIG. 4, and when the illuminating beam returns to position 74 for scanning the adjacent short scan path segment 50′, it is desirable for the illuminated area along path 50′ to overlap that of scan path 50 so that multiple samples or data points can again be taken also along the X direction as well as along the Y direction. Spot 10 is not drawn to scale in FIG. 4 to show overlap between adjacent scan segments. Therefore, when the illumination beam is scanning along scan line 50′ from starting position 74 as shown in FIG. 4, the area illuminated would overlap spot 10; this overlapping spot is 10″ as shown in FIG. 6, where the spot 10″ is displaced along the negative X direction relative to spot 10 by one quarter of the long axis of the ellipse 10 and 10″.

As described above, the minimum width (that is, length of short axis) of the spots 10, 10′, 10″ is w. If the angle between the illuminating light beam and normal 150 to the surface 40 to be inspected is θ as shown in FIG. 3, then the magnitude of the long axis of the ellipse 10, 10′, 10″ is w/cos θ. Therefore, in each short scan path segment, the area illuminated sequentially by the illuminating light beam is given by (w/cos θ)*l, where l is the length of the scan path segment such as 50. Thus if the radius of surface 40 is R and T is the time it takes for the beam to scan the short scan path segment, then the time it takes for the illuminating beam to scan across the entire wafer is given by NπR²Tcosθ/lw (where the duty factor and the time required for illumination optics to move the beam between strips, such as strips 54, 56 have not been accounted for). In this expression, N is the number of pixels along the X axis in each spot such as 10, 10′, 10″, since each pixel on the surface will be illuminated N number of times during the scanning process to account for the variation of intensity of illumination in the X direction as described above. In the preferred embodiment illustrated in FIG. 6, where four data points are taken in both the X and Y directions, N has the value 4.

In the scanning process described above in reference to FIGS. 2-4, it is noted that it will require time for the illumination optics to move the illumination spot between strips, such as strips 54 and 56. If τ is the time required to move the illumination spot between strips, then this additional time should be accounted for to give the total time required to scan the entire wafer surface. In the preferred embodiment described above, a stage 124 which includes a motor is used in order to move the surface so as to move the illumination spot from the position for scanning one strip on the surface to the adjacent strip as shown in FIGS. 2 and 3. For a circular wafer of radius R, the spot will need to be moved 2R/ηl times between adjacent strips to move the spot across all the strips on the entire wafer, so that the additional time required is 2Rτ/ηl, where η is the duty factor (explained below).

As known to those skilled in the art, when AOD 30 is used to cause beam 38 to scan along each short scan path segment such as 50, time will be required at the beginning of the scan for the sound waves generated by the transducer portion of the AOD to reach the far end of the AOD crystal so as to begin deflecting the beam. This is accounted for by a quantity called the duty factor η given by equation 2 below, and therefore, the total time t_s it takes system 20 to scan the entire surface of a wafer with radius R is given by equation 3 below:

$$\eta = 1 - \frac{4kl}{\pi w T \Delta f} \qquad (2)$$

$$t_s = \frac{N \pi R^{2} T \cos\theta + 2R\tau w}{\eta l w} \qquad (3)$$
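Equations 2 and 3 can be evaluated numerically with the short sketch below, in SI units. The function names are hypothetical, and the inputs used in the usage lines (a 10 micron spot, a 100 microsecond chirp, a 114 MHz bandwidth, θ of 70° and a 0.3 second strip-to-strip move) are assumed example values chosen from within the ranges discussed in the text, not a statement of the actual operating point.

```python
import math


def duty_factor(k, l, w, T, delta_f):
    """Duty factor eta from equation 2: 1 - 4kl / (pi * w * T * delta_f)."""
    eta = 1.0 - 4.0 * k * l / (math.pi * w * T * delta_f)
    if eta <= 0.0:
        raise ValueError("chirp too short for the acoustic fill time")
    return eta


def total_scan_time(N, R, T, theta_deg, tau, w, l, eta):
    """Total scan time t_s (s) from equation 3.

    N samples per spot along X, wafer radius R (m), chirp duration T (s),
    incidence angle theta (degrees from the normal), strip move time tau (s),
    spot width w (m), scan segment length l (m), duty factor eta.
    """
    scanning = N * math.pi * R ** 2 * T * math.cos(math.radians(theta_deg))
    strip_moves = 2.0 * R * tau * w
    return (scanning + strip_moves) / (eta * l * w)


# Assumed example inputs for a 200 mm wafer (R = 0.1 m).
eta = duty_factor(k=1.7, l=6.2e-3, w=10e-6, T=100e-6, delta_f=114e6)
t_s = total_scan_time(N=4, R=0.1, T=100e-6, theta_deg=70.0,
                      tau=0.3, w=10e-6, l=6.2e-3, eta=eta)
```

With these assumed inputs the duty factor is about 0.88 and the total scan time comes out near 90 seconds. Shortening T or increasing θ lowers the first term of the numerator, while the strip-move term depends only on R, τ and the effective segment length ηl.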

From equation 3 above, it is evident that the shorter the time T to scan along a scan path segment such as 50, the shorter will be the time required to scan the entire wafer and therefore the higher the throughput. The time T is referred to as the chirp duration which also determines the data rate. The speed of the electronic circuit for processing the data ultimately sets a lower limit for the chirp duration.

From equation 1 above, for a given spot size, length of the scan path segment and the value of k, it is evident that the larger the bandwidth Δf or f2-f1, the smaller will be the clear aperture required of the AOD. To get maximum bandwidth from the AOD, the AOD should be operated at the highest possible frequency and one then expects to get one octave bandwidth around the center frequency of the transducer. However, the acoustic losses in the AOD crystal increase with the center frequency of operation. Large acoustic losses can cause two major problems: reduction in diffraction efficiency and thermal errors induced in the crystal. A reduction in the diffraction efficiency reduces the sensitivity of the system to small particles. When the AOD transducer is operated at high frequencies, more of the acoustic energy will be converted into heat which sets up thermal gradients in the AOD crystal. Such thermal gradients would cause errors by degrading the focal spot which in turn leads to a reduction in sensitivity for detecting anomalies. It is therefore advantageous to minimize the acoustic losses by selecting as low a center frequency of the transducer as possible. A compromise should then be found to yield acceptable detection sensitivity as well as acceptable throughput. Applicants found that a center frequency in the range of 50-300 megahertz and a bandwidth preferably within the range of 50-250 megahertz would be acceptable. The AOD 30 is preferably driven by a linear frequency modulated (FM) chirp signal from generator 80 in FIG. 2. The quantity ηl is the effective length of the scan path segment; in the preferred embodiment the effective length is in the range of 2 to 10 mm but more preferably has a value of about 5.47 mm, where l has the value of 6.2 mm.

From equation 3 above, it is seen that the larger the angle θ, the higher will be the throughput, since the illuminated spot will cover a larger area of the surface. But as noted above, the larger the spot size, the lower will be the sensitivity of detection. In the preferred embodiment, θ is in the range of 10-85° and more preferably in the range of 50-80°.

Also from equation 3 above, it is evident that the larger the number of samples taken across the illuminated spot diameter, the more time it would take to scan the wafer. In the preferred embodiment, the number of samples taken across the illuminated spot diameter along both orthogonal axes (X, Y) is in the range of 2-10. Where four samples are taken along at least the X axis, N is 4 in equation 3.

For sensitivity considerations, it is preferable for the minimum width w of the illuminated area to be in the range of 5-15 microns. If θ is in the range of 50-80°, then the illuminating beam will illuminate the scan path segments such as 50 at such speed that the surface is inspected at a speed not less than about 2.5 cm²/s, and more preferably in a range of about 2.5-3.8 cm²/s.

From equation 3 above, if the time required for moving the wafer or the illumination beam so that the illuminated spot is transferred between adjacent strips such as strips 54, 56 is taken into account, then the average speed for scanning the entire surface 40 will be reduced compared to that for scanning a short scan path segment such as segment 50. Furthermore, the speed for inspecting the entire wafer is further reduced because each pixel on the wafer is scanned multiple times as described above in reference to FIG. 6. If the value of τ is about 0.3 seconds, and where the scan speed along each scan path segment is not less than 2.5 cm²/s, then the average speed for the illumination beam scanning the entire surface is not less than about 1.5 cm²/s. In the preferred embodiment, the average speed is preferably within the range of about 1.5-5 cm²/s. If the surface 40 scanned has dimensions of not less than 200 millimeters in any direction along the surface, then the illumination beam will scan the entire surface in about 50-90 seconds. As noted above, the length of the scan path segments such as segment 50 is preferably small compared to the dimensions of the surface 40 inspected. In the preferred embodiment, these segments are substantially in the range of about 2-10 millimeters.

In the preferred embodiment, generator 80 supplies a linear FM chirp signal to drive the AOD so that the chirp duration is preferably in the range of 20-200 microseconds, and more preferably in the range of about 80-120 microseconds. The beam 28 before deflection by the AOD 30 has at least one cross-sectional dimension (e.g. the longer dimension) in the range of about 4-12 millimeters. Preferably, the scan lens 36 is placed substantially at one focal length away from AOD 30 so that beam 38 scans the surface 40 telecentrically.

From the above, it will be evident that the objective of the invention of a high sensitivity and high throughput surface inspection system has been achieved, while a moderate data rate (e.g. 22 MHz) at modest cost for the data sampling and processing electronics can still be achieved. This system is capable of inspecting patterned wafers with 0.35 micron design rule, such as patterned wafers for 64 and 256 megabit DRAM technology. The system is capable of detecting contaminant particles and pattern defects on memory and logic devices. With the present state-of-the-art robotic implementation for removing and replacing wafer 40 on stage 124 ready for system 20 to inspect and the inherent delay (about 25 seconds/wafer) involved therein, system 20 described above is capable of inspecting in excess of about 40 wafers per hour for 150 millimeter diameter wafers (6-inch wafers), in excess of about 20 wafers per hour for 200 millimeter diameter wafers (8-inch wafers) and in excess of about 10 wafers per hour for 300 millimeter diameter wafers (12-inch wafers).
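The relation between per-wafer scan time, handling delay and hourly throughput is simple arithmetic, sketched below. The function name is hypothetical. The 25 second handling delay and the 50-90 second scan time for a 200 millimeter surface are the figures quoted in the text; treating throughput as one hour divided by the sum of the two is an assumption that ignores any overlap between handling and scanning.

```python
def wafers_per_hour(scan_time_s, handling_time_s=25.0):
    """Wafers inspected per hour if each wafer costs scan plus handling time."""
    if scan_time_s <= 0 or handling_time_s < 0:
        raise ValueError("times must be positive")
    return 3600.0 / (scan_time_s + handling_time_s)


# 200 mm wafer: slowest and fastest scan times quoted in the text.
slow = wafers_per_hour(90.0)
fast = wafers_per_hour(50.0)
```

Under this simple model the 200 millimeter case gives roughly 31 to 48 wafers per hour, which is consistent with the stated figure of in excess of about 20 wafers per hour.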

FIGS. 7A, 7B are side views showing the polarization schemes employed by the present invention. It is found that by employing certain polarization schemes, the signal-to-background of the system can be substantially improved. The polarization scheme employed may be surface-dependent and may also be used to determine the composition of the anomaly (such as metallic as opposed to dielectric material). For pattern defects, the polarization elements included in the post-AOD and polarization selection optics 34 of FIG. 2 place the illumination beam in a state of either P or S polarization. FIG. 7A illustrates the situation where the illumination beam 214 is in a state of S polarization where the electrical field E is perpendicular to the plane of incidence which is defined by the incident beam 214 and the specularly reflected beam 214 a; this plane of incidence is parallel to the plane of the paper. A vector representation of the beam is shown by a k vector representing the direction of propagation. The magnetic field is shown as the H vector. The electric field vector is shown as being perpendicular to the plane of incidence by representing it with a dot and labeled E. In FIG. 7B, the beam 214 is in a state of P polarization where the electric field E is in the plane of incidence and the plane of the paper. In FIG. 7B the beam 214 is shown in vector form with a propagation vector k, a magnetic field vector shown as a dot labeled H, and the electric field vector E perpendicular to the propagation vector k. Instead of P or S polarization states, the illumination beam can also have a left or right handed circular polarization. Where the polarization state of the illumination beam is chosen to optimize signal-to-background and defect detection, the collector or collection channels include polarization filters to pass light of predetermined polarization states to enhance detection capability and signal-to-noise ratio.

While in the invention described above, the scan path segments are described and illustrated as straight lines, it will be understood that it is also possible for curved scan lines to be employed, such as where the wafer is rotated about an axis instead of translated along straight lines in the X and Y directions as described above. While in the preferred embodiment described above, the short scan path segments form arrays, each array covering a substantially rectangular strip of the wafer, it will be understood that other different arrangements of the scan path segments are possible for covering the entire or substantially the entire surface 40; such and other variations are within the scope of the invention. As the spot 10 approaches the edge of surface 40, the length of the scan path segment may be reduced so that the spot does not fall outside surface 40. All the advantages described are obtained even though the segments are of different lengths if each of at least some of the segments has a span shorter than the dimensions of the surface. Also, the AOD 30 may be replaced by a polygonal scanner or galvanometer. While the invention has been described by reference to preferred embodiments, it will be understood that modifications and changes can be made without departing from the scope of the invention which is to be defined only by the appended claims.

The present invention, as shown in FIG. 8, is based on the discovery that the scattering cross section of an anomaly on a patterned surface is asymmetrical. This in part is due to the asymmetry of the anomaly itself, or, in the case of particulate contaminants, the pattern on which a particulate rests changing the effective scattering cross section of the particle. Taking advantage of this discovery, a plurality of detectors is provided that includes groups of collector channels symmetrically disposed about the circumference of the surface. Although a greater number of collector channels may be employed in each group, the preferred embodiment uses two groups of two collector channels, 1010 a-b and 1011 a-b, disposed symmetrically about the wafer surface 1012 so that each collector channel within a pair is located at the same azimuthal angle on opposite sides of the scan line, indicated by the line B. With collector channels positioned symmetrically in the azimuth, a substantial reduction in false counts can be obtained. For example, an anomaly having a symmetrical scattering cross section will cause scattered light to impinge on a pair of collector channels, disposed symmetrically in the azimuth, with the same intensity. Light scattered by anomalies with an asymmetrical scattering cross section will impinge on the same pair of collector channels with differing intensities. By comparing data representing the intensity of light impinging on symmetrically disposed collector channels, signals which are in common, such as pattern signals, may be discarded. This provides a high confidence level that the resulting signals are in fact anomalies, and not due to random scattering by surface features. The data from the channels is compared by performing various algorithms and logical operations, e.g., OR, AND and XOR. In addition, examining the data concerning the anomalies having non-identical signals in the two channels allows determining their shape and/or composition.

As shown in FIG. 8, a light source 1013, typically a laser, emits a beam 1014. Beam 1014 is directed towards the pre-deflector optics 1015, which consists of a half wave-plate, a spatial filter and several cylindrical lenses, in order to produce an elliptical beam with a desired polarization that is compatible with the scanner 1016. The pre-deflector optics 1015 expands the beam 1014 to obtain the appropriate numerical aperture. The post-deflector optics 1017 includes several cylindrical lenses and an air slit. Finally, the beam 1014 is brought into focus on the wafer surface 1012 and scanned along the direction, in the plane of the wafer surface 1012, indicated by B, perpendicular to the optical axis of the beam 1014. The type of deflector employed in the apparatus is application dependent and may include a polygonal mirror or galvanometer. However, in the preferred embodiment, deflector 1016 is an Acousto-optic Deflector. The wafer surface 1012 may be smooth 1018 or patterned 1019. In addition to the collector channels 1010 a-b and 1011 a-b, described above, detector channels are provided which include a reflectivity/autoposition channel 1020, an imaging channel 1021 and an alignment/registration channel 1022, each of which is discussed more fully below.

The beam 1014 has a wavelength of 488 nm and is produced by an Argon ion laser. The optical axis 1048 of the beam 1014 is directed onto the wafer surface 1012 at an angle, Θ. This angle, Θ, is in the range of 55-85° with respect to the normal to the wafer surface 1012, depending on the application. The scanning means includes the deflector 1016 and the translation stage 1024 upon which the wafer rests. The position of the wafer on the stage 1024 is maintained in any convenient manner, e.g., vacuum suction. The stage 1024 moves to partition the surface 1012 into striped regions, shown as 1025, 1026 and 1027, with the deflector 1016 moving the beam across the width of the striped regions.

Referring to FIG. 9, the grazing angle of the beam 1014 produces an elliptical spot 1023 on the wafer surface 1012, having a major axis perpendicular to the scan line. The deflector 1016 scans the spot 1023 across a short scan line equal in length to the width of striped region 1025 to produce specularly reflected and scattered light. The spot 1023 is scanned in the direction indicated, as the stage 1024 moves the wafer perpendicular to the scan line. This results in the spot 1023 moving within the striped region 1025, as shown in FIG. 10. The preferred embodiment scans in only one direction as indicated by scan path 1028. Scan path 1028 has an effective start location at 1029 and the spot 1023 moves to the right therefrom until it reaches the border 1031 of striped region 1025. Once the spot 1023 reaches border 1031, the stage 1024 moves perpendicular to the scan direction and the spot assumes a new start position 1030 and moves parallel to scan line 1028, along scan line 1032. The deflector 1016 continues to scan the spot 1023 in this fashion along the entire length of striped region 1025. Upon completion of the scan of striped region 1025, the stage 1024 moves the wafer to permit the scanning of the adjacent striped region 1026. The effective start location 1033 is positioned so that the stage 1024 shall move perpendicular to each scan line in a direction opposite to that when scanning striped region 1025, thereby forming a serpentine scan. This is demonstrated by scan paths 1034 and 1035. Moving the stage 1024 to scan adjacent striped regions in opposite directions substantially reduces the amount of mechanical movement of the stage while increasing the number of wafers scanned per hour.
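The serpentine ordering of scan lines described above can be sketched as a small generator. It is a simplified model under stated assumptions: every striped region is treated as a rectangle with the same number of scan lines (the circular wafer edge is ignored), each scan line runs in the same direction across the strip, and the stage direction reverses from one strip to the next. The function and parameter names are hypothetical.

```python
def serpentine_scan(num_strips, lines_per_strip, strip_width, line_pitch):
    """Yield the (x, y) start position of each scan line in visiting order.

    x is measured across the strips (the deflector scan direction) and
    y along the strip (the stage travel direction). Even-numbered strips
    are traversed with y increasing, odd-numbered strips with y decreasing.
    """
    for strip in range(num_strips):
        rows = range(lines_per_strip)
        if strip % 2 == 1:
            rows = reversed(rows)  # stage travels the opposite way
        for row in rows:
            yield (strip * strip_width, row * line_pitch)


# Two strips of three scan lines each, with illustrative dimensions.
starts = list(serpentine_scan(num_strips=2, lines_per_strip=3,
                              strip_width=5.0, line_pitch=1.0))
```

In this model the last scan line of one strip and the first scan line of the next share the same y position, so the stage only needs a sideways step between strips rather than a full return travel.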

Referring to FIGS. 8 and 10, light scattered from the wafer surface 1012 is detected by a plurality of detectors, including collector channels 1010 a-b and 1011 a-b. An important aspect of the collector channels is that they collect light over a fixed solid angle, dependent upon, inter alia, the elevational and azimuthal angle of the channel. The optical axis of each collection channel is positioned at an angle of elevation ψ in the range of 70-90 degrees, with respect to the normal to the surface 1012. As discussed above collector channels 1010 a and 1010 b are symmetrically positioned at the same azimuthal angle with respect to beam 1014, on opposite sides of the scan line. Collector channels 1010 a and 1010 b are positioned, with respect to the beam 1014, at an azimuthal angle Φ₁ in the range of about 75 to about 100 degrees to collect laterally scattered light. Laterally scattered light is defined as light scattered at azimuthal angles in the range of about 75 to about 100 degrees, with respect to beam 1014. Similar to collector channels 1010 a and 1010 b, channels 1011 a and 1011 b are positioned on opposite sides of the scan line at the same azimuthal angle; however, the azimuthal angles Φ₂ of channels 1011 a and 1011 b are in the range of 30 to 60 degrees, to collect forwardly scattered light. Forwardly scattered light is defined as light scattered at azimuthal angles in the range of 30 to 60 degrees.

Providing the groups of collector channels, at differing azimuthal angles, facilitates classifying detected anomalies, by taking advantage of a discovery that laterally scattered light is more sensitive to detecting pattern defects, and forwardly scattered light is more sensitive to detecting particulate contaminants. To that end, channels 1010 a and 1010 b are positioned to collect laterally scattered light, representing pattern defects, and channels 1011 a and 1011 b are provided to collect forwardly scattered light, representing particulate contamination.

Referring to FIG. 11, each collector channel 1010 a-b and 1011 a-b includes a lens system 1113 that collects scattered light. A series of mirrors 1114 a-c reflect the light so that it is imaged onto a photomultiplier tube (PMT) 1115. The PMT 1115 converts the light impinging thereon into an electrical signal having a voltage level that is proportional to the light intensity. Positioned at the Fourier transform plane is a programmable spatial filter 1116 and a variable aperture stop 1117. The programmable spatial filter 1116 allows the system to take advantage of spatial filtering when periodic features on the surface 1012 are scanned. In addition to the angle of elevation and the azimuthal angle of each channel, the variable aperture stop permits varying the elevational collection angle by limiting the light introduced into the collector channel, in accordance with the geometry of the features on the wafer surface 1012. Also located proximate to the Fourier transform plane is a variable polarization filter 1118. It should be noted that it is also possible to place a PMT directly at the Fourier transform plane.

Referring to FIGS. 12A and 12B, it was found that by employing the following polarization schemes, the signal to background of the system could be substantially improved. To obtain optimum signal to background, the polarization scheme employed by the system is surface dependent. It may also be used to determine the composition of the anomaly, e.g., as composed of metallic or dielectric material. With respect to pattern defects, the polarizing element included in the post-scanner optics 1017 will place the beam 1014 in a state of either P- or S-polarization. A beam is in a state of S-polarization when its electrical field is perpendicular to the plane of incidence. The plane of incidence is parallel to the plane of the paper. It is defined by the surface 1012, beam 1014 and reflected beam 1014 b. A vector representation of the beam is shown by a $\vec{k}$ vector representing the direction of propagation. The magnetic field is shown as the $\vec{H}$ vector. The electric field vector is shown as being perpendicular to the plane of incidence by representing it with a dot labeled $\vec{E}$. A beam is in a state of P-polarization when the electric field is in the plane of incidence. This is shown in FIG. 12B, where the beam 1014 is shown in vector form with a propagation vector $\vec{k}$, a magnetic field vector shown as a dot labeled $\vec{H}$, and the electric field vector $\vec{E}$ perpendicular to the propagation vector $\vec{k}$. Referring also to FIG. 11, if beam 1014 is incident on the surface 1012 in an S state of polarization, the variable polarization filter 1118 would allow scattered light in an S state of polarization to pass through it and attenuate all other scattered light. For example, both non-polarized and P-polarized light would be attenuated and S-polarized light would be collected by the collector channels.
Alternatively, optimizing the detection of pattern defects could be accomplished with an S-polarized beam 1014 and the polarization filter allowing all scattered light to pass through it. If the beam 1014 is in a P-polarization state, the variable polarization filter 1118 would allow P-polarized light to pass through it and would attenuate all other scattered light. Alternatively, the polarization filter could allow all scattered light to be detected when beam 1014 is P-polarized. This also optimizes detection of pattern defects. Similarly, if the beam 1014 were incident on the surface 1012 with either a left or right handed circular polarization, the collector channels would be very sensitive to detecting pattern defects by allowing the polarization filter to pass all the collected light therethrough.

To detect particulate contaminants on a patterned surface, the variable polarization filter 1118 would attenuate scattered light that is not in a P state of polarization, if the beam were S-polarized. Were beam 1014 in a P state of polarization, the collector channels would collect scattered light that was S-polarized, with the variable polarization filter 1118 attenuating all other scattered light impinging on the channel. For detecting particulates on a bare surface, beam 1014 would be in a P state of polarization and the collector channels would collect all light scattered therefrom to maximize the capture rate.
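The polarization schemes described above can be summarized as a lookup from the inspection target and the beam polarization to the filter settings the description permits. The following is a minimal sketch only; the key names, the setting labels and the function `filter_options` are hypothetical and do not appear in the specification.

```python
# Hypothetical summary table of the polarization schemes described in the text.
# Key: (inspection target, polarization of beam 1014).
# Value: settings of the variable polarization filter 1118 that the text allows.
FILTER_SCHEMES = {
    ("pattern_defect", "S"): ("pass_S", "pass_all"),
    ("pattern_defect", "P"): ("pass_P", "pass_all"),
    ("pattern_defect", "left_circular"): ("pass_all",),
    ("pattern_defect", "right_circular"): ("pass_all",),
    ("particle_on_patterned_surface", "S"): ("pass_P",),
    ("particle_on_patterned_surface", "P"): ("pass_S",),
    ("particle_on_bare_surface", "P"): ("pass_all",),
}


def filter_options(target, beam_polarization):
    """Return the filter settings listed for this combination.

    An empty tuple means the description gives no scheme for the combination.
    """
    return FILTER_SCHEMES.get((target, beam_polarization), ())


if __name__ == "__main__":
    print(filter_options("pattern_defect", "S"))
    print(filter_options("particle_on_patterned_surface", "S"))
```

In this sketch, pattern defects are detected with a filter matched to the beam polarization (or with no filtering), while particulates on a patterned surface are detected with the filter crossed to the beam polarization.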

Referring to FIG. 13, an electrical signal 1037 is produced by one of the inspection channels corresponding to an intensity I of collected scattered light as a beam scans over a scan path. The abscissa X of the graph in FIG. 13 represents the spatial position of the beam along the scan path. Signal 1037 is made of a plurality of discrete samples taken during the scan, e.g., a plurality of scan lines, each of which was scanned at a different position on a surface.

FIGS. 8 and 14A-14E are an example of an interchannel communication scheme. Shown therein is a resulting display of a map constructed by a processor 1500 from the signals produced by the inspection channels. For purposes of this example, FIGS. 14A and 14B represent scattered light detected by a pair of collector channels. The light detected from the surface consists of a plurality of signals, shown as spots 1038. These spots may represent anomalies or false positives: light detected from features or other non-anomalies present on the surface. The spots 1038 may be stored digitally in the processor memory at addresses corresponding to spatial positions on the surface. The processor 1500 compares the data stored in memory at addresses represented by the map shown in FIG. 14A with the data stored in memory represented by the map shown in FIG. 14B. The data can be compared by performing various algorithmic or logical operations on it. A logical OR operation maximizes the capture rate at the expense of a potential increase in false counts by storing all anomalies detected by either channel in memory. The composite map shown in FIG. 14C is the end result of performing a logical OR operation on the data stored in the processor's memory addresses, as represented by the maps shown in FIGS. 14A and 14B. Alternatively, a logical AND operation would discard all anomalies that are not common to both channels, which is the preferred embodiment. The composite map shown in FIG. 14D is the end result of performing a logical AND operation on the data stored in the processor's memory addresses, as represented by the maps shown in FIGS. 14A and 14B. An exclusive OR operation discards anomalies that are detected on both channels, keeping only those anomalies which are not commonly detected, as shown in FIG. 14E. These “suspect” particles would merit further examination with, inter alia, a high resolution microscope which could be employed on the system.
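The three logical operations described above can be illustrated with a minimal sketch in which each channel map is modeled as a set of (x, y) positions at which a spot was registered. This representation, the function name `combine_maps` and the sample coordinates are assumptions made for illustration; the specification only states that spots are stored at memory addresses corresponding to spatial positions.

```python
def combine_maps(map_a, map_b, operation):
    """Combine two channel maps, each a collection of (x, y) spot positions.

    OR  keeps every spot seen by either channel (maximum capture rate).
    AND keeps only spots common to both channels.
    XOR keeps only spots seen by exactly one channel ("suspect" spots).
    """
    a, b = set(map_a), set(map_b)
    if operation == "OR":
        return a | b
    if operation == "AND":
        return a & b
    if operation == "XOR":
        return a ^ b
    raise ValueError(f"unsupported operation: {operation!r}")


if __name__ == "__main__":
    # Illustrative positions only; not taken from FIGS. 14A-14E.
    channel_a = {(3, 7), (10, 2), (15, 9)}
    channel_b = {(3, 7), (15, 9), (20, 4)}
    for op in ("OR", "AND", "XOR"):
        print(op, sorted(combine_maps(channel_a, channel_b, op)))
```

With these sample maps, OR yields four positions, AND yields the two positions common to both channels, and XOR yields the two positions seen by only one channel.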

Referring again to FIG. 13, another manner in which to construct the maps, shown in FIGS. 14A-C, is provided in which only those positions where the signal 1037 crosses a certain threshold voltage level are stored in memory, while the remaining signal portions are discarded. For example, two threshold levels are shown: a fixed threshold level 1039, and a variable threshold level 1040. At threshold level 1039, peaks 1041-1047 are registered and stored in memory. At the variable threshold level 1040, as shown, only peaks 1041, 1043 and 1045 are stored in memory. Using the variable threshold level 1040 as shown, fewer positions are registered to form a map, thereby making the subsequent processing faster, but at the risk of failing to detect smaller anomalies. The fixed threshold level 1039 registers a greater number of positions, but makes the system slower. Typically, the fixed threshold level 1039 is preset before scanning a wafer, and the variable threshold level is derived from the reflectivity/autoposition channel as described below.
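The thresholding step can be sketched as follows, assuming the signal 1037 is available as a list of sampled voltages indexed by scan position. The function name `positions_above` and the sample values are hypothetical. The sketch registers every sample that exceeds the threshold, which is a simplification of registering peaks.

```python
def positions_above(signal, threshold):
    """Return the sample indices at which the signal exceeds the threshold.

    threshold may be a single number (a fixed level, like level 1039) or a
    sequence with one level per sample (a variable level, like level 1040).
    """
    if isinstance(threshold, (int, float)):
        levels = [threshold] * len(signal)
    else:
        levels = list(threshold)
        if len(levels) != len(signal):
            raise ValueError("variable threshold needs one level per sample")
    return [i for i, (v, t) in enumerate(zip(signal, levels)) if v > t]


if __name__ == "__main__":
    # Illustrative voltages only; not read from FIG. 13.
    signal = [0.1, 0.9, 0.2, 0.5, 0.1, 0.8]
    fixed = 0.4
    variable = [0.4, 0.4, 0.4, 0.6, 0.6, 0.6]
    print(positions_above(signal, fixed))
    print(positions_above(signal, variable))
```

In this example the variable level is higher over the second half of the scan, so the sample at index 3 is registered under the fixed level but discarded under the variable level.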

Although the above-described example discussed comparing maps from signals generated by a pair of collector channels, this is not the only manner in which the system may operate. It is to be understood that maps formed from signals generated by the detector channels may also be compared to identify and classify anomalies, by performing algorithms and logical operations on the data, as described above. Comparing signals to a variable threshold level provides an instructive example, because the threshold level is derived from the bright field reflectivity/autoposition channel 1020, shown in FIG. 8.

The variable threshold level is dependent upon the local reflectivity. To that end, the bright field reflectivity/autoposition channel 1020 is positioned in front of the beam 1014 to collect specularly reflected light. The bright field signal derived from this channel carries information concerning the pattern, local variations in reflectivity and height. This channel is sensitive to various defects on a surface. For example, the bright field signal is sensitive to film thickness variations, discoloration, stains and local changes in dielectric constant. Taking advantage of bright field signal sensitivity, the bright field signal is used to produce the variable threshold level 1040, shown in FIG. 13. It is also used to produce an error height signal, corresponding to a variation in wafer height, which is fed to a z-stage to adjust the height accordingly, as well as to normalize the collector and detection channel signals, whereby the signals from the inspection channels each are divided by the bright field signal. This removes the effect of dc signal changes due to surface variations. Finally, the bright field signal can be used to construct a reflectivity map of the surface. This channel is basically an unfolded Type I confocal microscope operating in reflection mode. It is considered unfolded because the illuminating beam and reflected beams, here, are not collinear, whereas in a typical reflection confocal microscope the illuminating and reflected beams are collinear.
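The normalization described above, in which each inspection channel signal is divided by the bright field signal, can be sketched as follows. The function name `normalize`, the small floor `eps` that guards against division by zero, and the sample values are assumptions introduced for illustration.

```python
def normalize(channel_signal, bright_field_signal, eps=1e-12):
    """Divide an inspection channel signal by the bright field signal.

    Both inputs are sequences sampled at the same scan positions. eps is a
    hypothetical floor that avoids dividing by a zero bright field sample.
    """
    if len(channel_signal) != len(bright_field_signal):
        raise ValueError("signals must be sampled at the same positions")
    return [c / max(b, eps) for c, b in zip(channel_signal, bright_field_signal)]


if __name__ == "__main__":
    # Illustrative values: the channel signal tracks the local reflectivity.
    channel = [2.0, 4.0, 1.0]
    bright_field = [1.0, 2.0, 0.5]
    print(normalize(channel, bright_field))
```

Here the raw channel signal varies only because the local reflectivity varies, so the normalized signal is flat, which is the dc effect the division is meant to remove.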

Referring to FIG. 15, the imaging channel is shown to include a lens assembly 1119 that images scattered light onto a linear array of sensors 1120 having pixels 1121, e.g., charge-coupled detectors. The array 1120 is positioned so that the pixels are normal to the wafer surface 1012 with the lens assembly 1119 collecting upwardly scattered light. The spot 1023 is focused and scanned in synchronism with the transferring of a charge contained in each pixel 1121. This enables charging each pixel 1121 independently of the remaining pixels, thereby activating one pixel 1122 at a time with each pixel positioned so as to receive light scattered from a unique area of the sample surface, illuminated by the spot along the scan line. In this manner each pixel forms an image on the area illuminated by the spot, wherein there is a one-to-one correlation between a pixel and the spot position along the scan line. This increases the sensitivity of the system by improving the signal to background ratio. For example, it can be shown that for a PMT-based channel, the signal to background is defined as follows: $P_s / P_b = \sigma / (A_b h)$, where $P_s$ is the optical power scattered by a particle, $P_b$ is the background optical power, $A_b$ is the area of the beam on the surface and $\sigma$ and $h$ are constants. This shows that the ratio of the scattering cross section to the area of the beam determines the signal to background ratio.

With an imaging-based channel, all the scattered power from an anomaly is imaged onto one array element. The power distributed in background, however, is imaged over a range of elements, depending upon the magnification of the system. Assuming a linear magnification $M$, the background power at the image plane is spread over an area $M^2 A_b$, providing an effective background power per array element of $P_b = P_i h A_c / (M^2 A_b)$, where $A_c$ is the area of an array element. Therefore, the signal to background ratio is given by $P_s / P_b = M^2 \sigma / (A_c h)$. This shows that the signal to background ratio is independent of the spot diameter, and that it is improved relative to the PMT-based channel by the factor $i = M^2 A_b / A_c$.
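The improvement factor can be checked by dividing the imaging-channel ratio by the PMT-channel ratio, using only the two expressions given in the description. The subscripts "PMT" and "imaging" are labels added here to tell the two ratios apart.

```latex
\begin{align*}
\left(\frac{P_s}{P_b}\right)_{\mathrm{PMT}} &= \frac{\sigma}{A_b h}, \\
\left(\frac{P_s}{P_b}\right)_{\mathrm{imaging}} &= \frac{M^2 \sigma}{A_c h}, \\
i &= \frac{M^2 \sigma / (A_c h)}{\sigma / (A_b h)} = \frac{M^2 A_b}{A_c}.
\end{align*}
```

The constants $\sigma$ and $h$ cancel, so the improvement depends only on the magnification and on the ratio of the beam area to the area of one array element.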

If imaging is not desired, another PMT-based collector channel similar to the one shown in FIG. 11 may be employed in lieu of the imaging channel, to collect upwardly scattered light.

Referring again to FIG. 8, an alignment and registration channel 1022 is provided. The channel 1022 has the same design as the basic collection channels 1010 a-b and 1011 a-b, but it is positioned in the plane of incidence so that the signal produced from the patterns or features on the wafer's surface is at a maximum. The signal obtained is used to properly align the wafer surface 1012 so that the streets on the features are not oblique with respect to the scan line. This also reduces the amount of signal collected by the collector channels, resulting from scattering by patterns.

In operation, the beam 1014 is scanned over the surface 1012, producing both scattered and specularly reflected light, which are simultaneously detected. The light scattered laterally, forwardly and upwardly is simultaneously detected by the collector channels and the imaging system. The specularly reflected light from the wafer's surface 1012 is detected by the bright field reflectivity/autoposition channel 1020. Light detected by the inspection channels is converted into electrical signals which are further processed by dedicated electronics, including a processor 1500. The processor 1500 constructs maps from the signals produced by the inspection channels. When a plurality of identical dies are present on the wafer surface 1012, a detection method may be employed whereby periodic feature comparisons are made between adjacent die. The processor compares the maps from the inspection channels either in the analog domain or digitally, by performing logical operations on the data, e.g., AND, OR and XOR, to detect anomalies. The processor forms composite maps, each representing the anomalies detected by a single group of symmetrically disposed collector channels. The composite maps are then compared so that the processor may classify the anomalies as either a pattern defect or particulate contamination. Typically, the wafer surface 1012 has been aligned so that the streets on the die are not oblique with respect to the scan line, using the information carried by the electrical signal produced by the alignment/registration channel. Proper alignment is a critical feature of this invention, because periodic feature comparison is performed to locate anomalies.
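The periodic feature comparison between adjacent die can be sketched as below. The description does not specify how the comparison is carried out, so everything here is an assumption: each die is modeled as a list of signal samples taken at the same offsets within the die, and a hypothetical `tolerance` decides when two samples differ. The function name `die_to_die_differences` is likewise illustrative.

```python
def die_to_die_differences(dies, tolerance):
    """Compare each die with the next one, sample by sample.

    dies is a list of equal-length sample lists, one per die, taken at the
    same offsets within each die. Returns (die_index, offset) pairs at which
    die k and die k + 1 differ by more than tolerance.
    """
    if any(len(d) != len(dies[0]) for d in dies):
        raise ValueError("all dies must have the same number of samples")
    flagged = []
    for k in range(len(dies) - 1):
        for offset, (a, b) in enumerate(zip(dies[k], dies[k + 1])):
            if abs(a - b) > tolerance:
                flagged.append((k, offset))
    return flagged


if __name__ == "__main__":
    # Illustrative samples: the third die differs at offset 1.
    dies = [[1.0, 5.0, 1.0], [1.0, 5.0, 1.0], [1.0, 9.0, 1.0]]
    print(die_to_die_differences(dies, tolerance=1.0))
```

A sample-by-sample comparison of this kind only works when the same offset corresponds to the same feature in every die, which is consistent with the alignment requirement stated above.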

While the above described apparatus and method for detecting anomalies has been described with reference to a wafer surface, it can easily be seen that anomaly detection is also possible for photomasks and other surfaces, as well as producing reflectivity maps of these surfaces. The invention is capable of detecting anomalies of submicron size and affords the added advantage of classifying the type of anomaly and identifying its size and position on the surface. This information is highly useful to wafer manufacturers as it will permit locating the step in the wafer manufacturing process at which point an anomaly occurs. 

1-52. (canceled)
 53. An optical scanning system for detection of anomalies, such as particles, on a surface comprising: means for producing a beam of light, means for directing the beam onto a sample surface of the type having locations with periodic and non-periodic features, producing an illuminated spot thereon with the beam having a grazing angle of incidence, means for scanning the spot, means for detecting light scattered from the surface, including a plurality of detectors symmetrically positioned about the surface, to collect laterally scattered light, defining a first group of detectors, with each of the plurality of detectors producing a first electrical signal representing scattered light; and means, in electrical communication with the opposed detectors, for processing each first electrical signal independently of the other to determine the presence or absence of anomalies, the processing means including a means for producing a map from each first electrical signal, defining a first map set, the first map set representing the entire scan path over the surface, the processing means including a means for comparing maps of the first map set to identify anomalies.
 54. The optical scanning system as recited in claim 53 wherein the scanning means scans the spot in a serpentine fashion over the sample surface.
 55. The optical scanning system as recited in claim 53 wherein the detecting means includes a second group of a plurality of detectors symmetrically positioned about the surface, to collect forwardly scattered light, with each of the plurality of detectors in the second group producing a second electrical signal representing scattered light.
 56. The optical scanning system as recited in claim 55 wherein the producing means forms a second map set from the second electrical signals, representing the entire scan path, with the comparing means comparing maps of the second map set to identify anomalies.
 57. The optical scanning system as recited in claim 56 wherein the classifying means produces a third signal representing the presence of pattern defects and a fourth signal representing particulate contaminants, based upon scattered light information carried by the first and second map set.
 58. The optical scanning system as recited in claim 55 wherein each collection channel includes a variable aperture to optimize the collection angle for detecting scattered light.
 59. The optical scanning system as recited in claim 55 wherein each collection channel includes a variable polarization filter to allow light having desired predetermined states of polarization to pass therethrough, while attenuating light that does not have the desired state of polarization, and a polarizing element placed in the path of the beam, between the scanning means and the surface, to polarize the beam to a predetermined state of polarization, wherein each collection channel detects an anomaly selected from the group consisting of particulate contaminant or pattern defect.
 60. The scanning system as recited in claim 59 wherein the polarizing element is of the type to provide an S polarization state of the beam passing therethrough.
 61. The scanning system as recited in claim 59 wherein the polarizing element is of the type to provide a P polarization state of the beam passing therethrough.
 62. The scanning system as recited in claim 59 wherein the polarizing element is of the type to provide a left handed circular polarization state of the beam passing therethrough.
 63. The scanning system as recited in claim 59 wherein the polarizing element is of the type to provide a right handed circular polarization state of the beam passing therethrough.
 64. The scanning system as recited in claim 60 wherein the variable filters associated with the first group of detectors attenuate polarized scattered light with the variable filters associated with the second group of detectors attenuating scattered light which is not in a P state of polarization.
 65. The scanning system as recited in claim 61 wherein the variable filters associated with the first group of detectors attenuate polarized scattered light with the variable filters associated with the second group of detectors attenuating scattered light which is not in an S state of polarization.
 66. The scanning system as recited in claim 60 wherein the plurality of variable filters attenuates scattered light not having an S state of polarization.
 67. The scanning system as recited in claim 60 wherein the plurality of variable filters attenuates scattered light not having a P state of polarization.
 68. The scanning system as recited in claim 61 wherein the plurality of variable filters attenuates scattered light not having an S state of polarization.
 69. The scanning system as recited in claim 61 wherein the plurality of variable filters attenuates scattered light not having a P state of polarization.
 70. The optical scanning system as recited in claim 53 wherein the first group of detectors are a pair of opposed detectors oriented to receive light scattered in a direction less than 30 degrees above the surface and symmetrically disposed on opposite sides of the scanning beam at an azimuthal angle of 75 to 95 degrees, with respect to the scanning beam.
 71. The optical scanning system as recited in claim 54 wherein the second group of detectors are a pair of spaced-apart detectors oriented to receive light scattered in a direction less than 30 degrees above the surface and symmetrically disposed on opposite sides of the scanning beam at an azimuthal angle of 30 to 60 degrees, with respect to the scanning system.
 72. The optical scanning system as recited in claim 53 further including an autoposition detector for collecting specularly reflected light from the spot, the autoposition detector having a means for measuring a change in height of the surface.
 73. The optical scanning system as recited in claim 72 wherein the autoposition detector includes a reflectivity channel producing a normalization signal representing a threshold voltage based upon the intensity of reflecting light, wherein those anomalies represented by a signal having a voltage level less than the threshold voltage are discarded.
 74. The optical scanning system as recited in claim 53 including a normal detector positioned normal to said surface to collect upwardly scattered light from the spot.
 75. The optical scanning system as recited in claim 74 wherein the normal detector includes an array of sensors for forming pixels consisting of a plurality of electronic bit signals corresponding to the image viewed by the sensors, and means for transferring a charge from each said pixel in synchronism with the scanning spot so that each pixel receives light scattered from a unique area illuminated by the spot along the scan line.
 76. The optical scanning system as recited in claim 53 wherein the features include a plurality of dies each having streets forming a grid and further including an alignment detector positioned above said surface to maximize collection of light scattered from the features on the surface, the alignment detector being connected to a means for aligning the surface so that the scan line is not oblique to the streets of the die.
 77. The optical scanning system as recited in claim 56 further including an autoposition detector, collecting specularly reflected light, and a normal detector positioned normal to the surface, collecting upwardly scattered light, the autoposition detector producing a normalization signal representing selected light and the normal detector producing a third electrical signal representing upwardly scattered light, with the producing means forming a third map from the normalization signal and a fourth map from the third electrical signal, whereby the producing means performs logical operations, comparing the maps to detect and characterize anomalies, on the surface.
 78. The optical scanning system as recited in claim 77 wherein the logical operations are digital operations from the group consisting of AND, OR and XOR.
 79. An optical scanning system for detection of anomalies on a surface comprising: means for producing a beam of light, means for directing the beam onto a sample surface of the type having locations with periodic and non-periodic features, producing an illuminated spot thereon with the beam having a grazing angle of incidence, means for scanning said spot, in a serpentine fashion, over the sample surface, each scan defining a short scan line, means for detecting light scattered from the surface, including a pair of opposed collection channels symmetrically positioned about the surface, on opposite sides of the beam to collect laterally scattered light and a pair of spaced apart collection channels symmetrically positioned about the surface, on opposite sides of the beam to collect forwardly scattered light with the opposed collection channels producing a first and a second electrical signal and the spaced apart collection channels producing a third and a fourth electrical signal, each electrical signal representing scattered light impinging on a collection channel, and means, in electrical communication with each collection channel, for processing each of the signals therefrom independently of the remaining signals to determine the presence or absence of anomalies.
 80. The optical scanning system as recited in claim 79 wherein the opposed collection channels are positioned on opposite sides of the beam at an azimuthal angle in the range of 75-95 degrees, and the spaced apart channels are positioned on opposite sides of the beam at an azimuthal angle in the range of 30-60 degrees.
 81. The optical scanning system as recited in claim 79 wherein the processing means includes a means for producing a map from each electrical signal representing the entire scan path over the surface with a first map generated from the first signal, the second map generated from the second signal, the third map generated from the third signal and the fourth map generated from the fourth signal, the processing means including a means for comparing the first map with the second map, with the differences between them being anomalies recorded as a first composite map, and comparing the third map with the fourth map, with the differences between them being anomalies recorded as a second composite map.
 82. The optical scanning system as recited in claim 81 wherein the processing means further includes a classifying means for comparing the first composite map with the second composite map, thereby classifying the anomalies present as a material selected from the group consisting of particulate contaminant or pattern defect.
 83. The optical scanning system as recited in claim 82 including a normal detector positioned normal to said surface to collect upwardly scattered light from the spot.
 84. The optical scanning system as recited in claim 83 wherein the normal detector includes an array of sensors for forming pixels consisting of a plurality of electronic bit signals corresponding to the image viewed by the sensors, and means for transferring a charge from each of the pixels in synchronism with the scanning spot so that each pixel receives light scattered from a unique area illuminated by the spot along the scan line.
 85. The optical scanning system as recited in claim 84 wherein each collection channel includes a variable aperture to optimize the collection angle for detecting scattered light.
 86. The optical scanning system as recited in claim 85 wherein each collection channel includes a variable polarization filter, each of the variable filters being adjusted to permit detection of light having a state of predetermined polarization, and a polarizing element placed in the path of the beam, between the producing means and the surface to place the beam in a predetermined state of polarization, wherein each collection channel detects an anomaly selected from the group consisting of particulate contaminant and pattern defect.
 87. An optical scanning system for detection of anomalies on a surface comprising: means for producing a beam of light, means for directing the beam onto a sample surface, producing an illuminated spot thereon, means in the path of the light beam for scanning said spot along a scan line, a detector having an array of sensors for forming pixels consisting of a plurality of electronic bit signals corresponding to the image viewed by the array of sensors, and means for transferring a charge from each of the pixels in synchronism with the scanning spot so that each pixel receives light scattered by the spot along the scan line, producing an electrical signal representing scattered light, and means, in electrical communication with the detector, for processing the electrical signals received from the sensors, producing an image.
 88. The optical scanning system as recited in claim 87 wherein the array of sensors are positioned normal to the sample surface for collecting upwardly scattered light therefrom.
 89. The optical scanning system as recited in claim 87 wherein each pixel is positioned so as to receive light scattered from a unique area of the sample surface, illuminated by the spot along the scan line.
 90. A method of detecting anomalies on a surface comprising: scanning an illuminated spot over a substantially flat surface of the type having locations with periodic and non-periodic features, simultaneously detecting laterally scattered light on opposite sides of the scanning spot at an azimuthal angle in the range 75-95 degrees, with respect to the scanning spot, producing a first and a second electrical signal representing scattered light, producing a first map from the first signal, representing light detected along the entire scan path and a second map from the second signal, representing light detected along the entire scan path, and comparing the first map with the second map, differences between them representing anomalies on the surface, defining a first composite map.
 91. The method as recited in claim 90 further including the steps of simultaneously detecting forwardly scattered light on opposite sides of the scanning spot at an azimuthal angle in the range of 30-60 degrees, producing third and fourth electrical signals, producing a third map from the third signal, representing light detected along the entire scan path and a fourth map from the fourth signal, representing light detected along the entire scan path, and comparing the third map with the fourth map, differences between them representing anomalies on the surface, defining a second composite map.
 92. The method as recited in claim 91 further including the step of classifying the anomalies by comparing the first composite map with the second composite map, wherein each anomaly detected is classified as a material selected from the group consisting of particulate contaminant or pattern defect.
 93. The method as recited in claim 90 further including the steps of, detecting specularly reflected light and producing a normalization signal having a threshold voltage level based upon the intensity of detected reflected light, and normalizing the differences found by comparing the voltage level of the first, second, third and fourth signals with the threshold level, wherein anomalies represented by a signal having a voltage level less than the threshold level are discarded.
 94. The method as recited in claim 87 further including the step of moving the surface so that it is scanned in a serpentine fashion over the entire surface.
 95. The method as recited in claim 91 wherein both laterally and forwardly scattered light is detected at an angle of elevation of less than 30 degrees, with respect to the surface.
 96. The method as recited in claim 91 wherein the features include a plurality of dies each of which has a plurality of streets positioned at right angles to one another and further including the step of aligning the surface so that a scan line is not oblique with respect to the streets.
 97. A method of detecting anomalies on a surface comprising: scanning an illuminated spot over a surface of the type having locations with periodic and non-periodic features, simultaneously detecting laterally scattered light, forwardly scattered light, upwardly scattered light and specularly reflected light with an optical detecting system, and producing a first plurality of electrical signals, representing laterally scattered light, a second plurality of electrical signals, representing forwardly scattered light, a third signal representing upwardly scattered light and a normalization signal representing specularly reflected light, producing a first map from the first signal, a second map from the second signal, a third map from the third signal and a fourth map from the normalization signal with each map representing light detected along the entire scan path, and comparing the first map, second map, third map and fourth map to determine the presence and type of anomaly.
 98. The method as recited in claim 97 wherein the normalization signal defines a threshold voltage and further including the step of comparing the voltage level of the first, second, third and fourth signals with the normalization signal, wherein anomalies represented by a signal having a voltage level less than the threshold are discarded.
 99. The method as recited in claim 97 wherein the comparing step includes performing logical operations on data represented by the first, second, third and fourth maps, the logical operations selected from the group consisting of AND, OR and XOR.
 100. The method as recited in claim 97 wherein laterally scattered light is detected at the same azimuthal angle, simultaneously on opposite sides of the scanning spot with the azimuthal angle being in the range 75-95 degrees, with respect to the scanning spot, and the forwardly scattered light is detected at the same azimuthal angle, simultaneously on opposite sides of the scanning spot, with the azimuthal angle being in the range 30-60 degrees, with respect to the scanning spot.
 101. The method as recited in claim 100 wherein laterally and forwardly scattered light is detected at an angle of elevation of less than 30 degrees, with respect to the surface.
 102. The method as recited in claim 97 wherein the upwardly scattered light is detected by an array of sensors for forming pixels consisting of a plurality of electronic bit signals corresponding to an image viewed by the sensor and further including the step of transferring a charge from each pixel in synchronism with the scanning of that spot so that each pixel receives light scattered from a unique area illuminated by the spot along the scan line.
 103. The method as recited in claim 97 further including the step of producing a reflectivity map of the surface based upon the information carried by the normalization signal.
 104. A surface inspection method for distinguishing between particles on a surface and defects in the surface, comprising: receiving the surface; causing the surface to be scanned by a P-polarized beam of light at an oblique angle to the surface; collecting light scattered from the surface at a first imaging channel, and at least at a second oblique channel, said second channel offset angularly from said first imaging channel; converting the collected light from the two channels into respective signals representative of light scattered into the two channels; comparing the two signals and determining whether a defect is one of a particle or a defect in the surface based at least in part on said comparing.
 105. The method of claim 104, wherein said second channel is offset at least forwardly of said first channel.
 106. The method of claim 105, wherein said second channel receives substantially only forward scattered light.
 107. The method of claim 104, wherein said first channel includes a scattering direction substantially perpendicular to the surface.
 108. The method of claims 104-107, wherein said signals are representative of intensities of the light scattered into said channels.
 109. The method of claims 104-107, wherein said P-polarized light is incident upon said surface at an angle in a range of about 55 to 85 degrees of perpendicular.
 110. The method of claims 104-107, wherein the light collected in said second channel is scattered at an angle in the range of about 3 to 30 degrees from the surface.
 111. The method of claim 104, additionally including forming a first display map identifying the locations of surface defects on the surface.
 112. The method of claim 111, including forming a second display map identifying the locations of particle defects on the surface.
 113. The method of claim 104, wherein the scanning further comprises translationally transporting the surface along a path.
 114. A surface inspection method for distinguishing between particles on a surface and defects in the surface, comprising: receiving the surface at a surface inspection system; scanning the surface at the inspection system with a beam of P-polarized light at an angle of incidence oblique to the surface; collecting light scattered from the surface at the inspection system at a first central zone, and at least at a second oblique zone offset angularly from said first zone; converting the collected light components from said zones into respective signals representative of light scattered into said zones; comparing said signals and determining whether a defect is one of a particle or a defect in the surface based at least in part on said comparing. 